Working Paper: Modeling Gender Discrimination by Audiences of Online News
Abstract
The representation of women in public discourse, where they have historically been a minority, is important for fair, democratic societies. Although digital publishing has been heralded as a source of greater equality in women’s representation, it also creates opportunities for new forms of discrimination, e.g., from audiences on social media. In this working paper, we evaluate the hypothesis that online news audiences on social media like, share, and reshare articles by men and women at different rates. We fit three Poisson regression models that predict social media impressions (counts of likes, shares, and reshares) using a sample of 156,523 articles published by the Daily Mail, Guardian, and Telegraph from July 1, 2011 to June 30, 2012. Our models suggest that audiences like, share, and reshare articles by men and women differently. We explore these preliminary results and highlight one newspaper section where articles by women have an incidence rate of social media impressions that is 33% of the rate for articles by men. Our preliminary findings raise questions for further research on modeling gender discrimination by online audiences.
INTRODUCTION
The representation of women in public discourse is important for equal participation within democratic societies. For example, global studies have shown that cultural attitudes toward gender equality are a central element of democratization [14]. Media coverage of women is linked with political participation; when women take visible roles in politics, more women demonstrate political knowledge and vote [9]. Female role models also influence adolescents’ career decisions [32]. Although the representation of women in the news has increased over the past decade [19], gender inequality persists at the fundamental level of employment in news organizations. In the United States, for example, the journalism industry has failed to meet its own diversity hiring goals [1].
The percentage of women in US newsrooms has remained at 37% for the last 15 years [15], and the industry has maintained a trend of white male predominance that persists despite women outnumbering men in journalism schools since the 1980s [6].
Women have used the Internet to circumvent historical disparities, with parenting and feminist blogs gaining substantial visibility and power [18, 5, 24, 31]. Online publishing is inexpensive, the pool of voices is diverse, and institutional gatekeepers cannot prevent readers from accessing those voices [28]. For these reasons, proponents of online publishing have argued that by allowing citizen journalists and audiences to circumvent male-dominated institutions, online publishing broadens public conversation, making marginalized voices heard.
Despite early hopes that the Internet might foster peace [10] and global understanding [33], a growing literature has observed the reproduction and perhaps expansion of gender disparities, sexism, racism, and oligarchy among creators of online content, most notably in open source software development [27], peer production [16], news comments [26], and the videogame industry [21]. However, debates on inequities of attention and content sharing among audiences have primarily focused on concerns of political echo chambers [30] and filter bubbles [25] rather than problems of prejudice and inequality.
In this working paper,^1 we model gender discrimination by audiences of online news and provide preliminary results. Using social media impressions (i.e., counts of shares, likes, and reshares across several platforms) as our dependent variable, we test the hypothesis that online news audiences share, like, and reshare articles authored by men and women differently. We carry out this preliminary analysis using three Poisson regression models for articles published by three UK news outlets from July 2011 through the end of June 2012.
Finally, we provide an exploratory discussion of our preliminary results.
^1 Since this working paper describes work that is currently in progress, please do not cite it without first contacting the authors.
MODELING DISCRIMINATION
Quantitative research on inequality differentiates between discrimination and bias. In economics, research on discrimination focuses on situations where “members of a minority [or other marginalized group] are treated differently (less favorably) than members of a majority group with identical productive characteristics” [3], offering no account of the beliefs or attitudes involved in discrimination [7]. Conversely, research on prejudice and bias focuses on measuring and explaining the reasons for behaviors that produce discrimination, often through methods from social psychology and psychometrics [23]. Here, we explore differences between the rates at which online news audiences like, share, and reshare articles by men and women. We do not discuss the reasons for these differences, focusing on discrimination rather than prejudice or bias.
DATA COLLECTION
Our data set includes 314,771 articles published online by the Guardian, Telegraph, and Daily Mail newspapers from July 1, 2011 to June 30, 2012. We obtained 143,515 Guardian articles through the Guardian OpenPlatform API. We scraped 110,029 Telegraph articles and 61,228 Daily Mail articles from their websites’ daily archive pages. For the Guardian, we extracted metadata, including URLs, bylines, dates, sections, and titles, from the Guardian API. For the other two newspapers, we extracted metadata from article URLs and page contents.
We obtained the number of likes, shares, and reshares for each article by querying Facebook, Twitter, and Google Plus in August 2012, at least one month after the publication of every article in our data set.
We made these queries using the Mozilla Amo social media query system.^2 We refer to the total number of counts (i.e., likes plus shares plus reshares) for each article as that article’s social media impressions.
We obtained byline gender for each article by extracting names from each article’s byline and then coding these names for gender using automated techniques based on UK birth records [20, 11]. Our techniques are similar to those used in other quantitative studies of gender disparities online [26]. We labeled each byline as male if only male-identified names were present, female if only female-identified names were present, and both if men and women appeared as co-authors of the article. If a byline contained no author names (e.g., “Associated Press”), or if gender could not be identified using our automated techniques, we labeled it as unknown.
We obtained Guardian article sections from the Guardian API. We obtained Daily Mail and Telegraph article sections from topic designations in article URLs. For comparability across outlets, we coded sections (in consultation with multiple UK journalists) into a scheme that consists of nine categories: arts/culture, entertainment, lifestyle, money/finance, news, opinion, science/technology, special audience,^3 and sport.
In the rest of this working paper, we focus on a subset of 156,523 articles: 26,340 Daily Mail articles, 69,597 Guardian articles, and 60,586 Telegraph articles. These articles all have bylines that were coded as either male or female, and they all appeared in one of the following (coded) newspaper sections: sport, science/tech, opinion, news, money/finance, and lifestyle.
MODELING SOCIAL MEDIA IMPRESSIONS
To test the hypothesis that there are byline gender differences by section in articles’ social media impressions, we fit a multilevel, random-intercepts regression model for each newspaper.
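The byline labeling rule from the data collection step (male, female, both, unknown) can be sketched as below. This is only an illustration: the function name `label_byline` and its input format (one gender code per extracted name, with `None` for a name that could not be coded) are hypothetical, and treating a byline with any uncodable name as unknown is an assumption, since the text does not spell out that case.

```python
from typing import Optional, Sequence


def label_byline(name_genders: Sequence[Optional[str]]) -> str:
    """Label a byline from the gender codes of the names it contains.

    name_genders holds "male", "female", or None (gender not identified)
    for each author name extracted from the byline. Hypothetical format.
    """
    # No author names at all (e.g. a wire-service credit) -> unknown.
    if not name_genders:
        return "unknown"
    # Assumption: any name that could not be coded makes the byline unknown.
    if any(g not in ("male", "female") for g in name_genders):
        return "unknown"
    genders = set(name_genders)
    if genders == {"male"}:
        return "male"
    if genders == {"female"}:
        return "female"
    # Men and women appear together as co-authors.
    return "both"


if __name__ == "__main__":
    for codes in (["male"], ["female", "female"], ["male", "female"], []):
        print(codes, "->", label_byline(codes))
```

Only bylines labeled male or female enter the 156,523-article subset analyzed in the rest of the paper.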
^2 Amo was written by Cole Gillespie of Mozilla OpenNews Labs: https://github.com/OpenNewsLabs/amo/
^3 The special audience category includes commissioned articles and other content paid for by funders and corporations.
Dependent Variable: Social Media Impressions
We used the articles’ social media impressions as our dependent variable. Social media impressions for Daily Mail articles range from 0 to 95,350, with a mean of 129 and a median of 19. For Guardian articles, social media impressions range from 0 to 196,300, with a mean of 188 and a median of 53. For Telegraph articles, social media impressions range from 0 to 56,840, with a mean of 73 and a median of 28. Since our dependent variable is a count (i.e., a non-negative integer), we chose to model our data using a Poisson regression framework [17].
Covariates
The covariates that we included in each of our models are listed in Table 1. As well as including a byline-level binary covariate for gender and an article-level categorical covariate for newspaper section (with news as the reference section), we also included several other covariates, described below.
Table 1. Covariates used in all three models
Covariate   Description               Type
Article-Level Covariates
X1ia        log(title length)         real-valued
X2ia        log(title length)^2       real-valued
X3ia        Tuesday                   binary
X4ia        Wednesday                 binary
X5ia        Thursday                  binary
X6ia        Friday                    binary
X7ia        Saturday                  binary
X8ia        Sunday                    binary
X9ia        lifestyle                 binary
X10ia       money/finance             binary
X11ia       opinion                   binary
X12ia       science/tech              binary
X13ia       sport                     binary
Byline-Level Covariates
X14a        female                    binary
X15a        log(total articles)       real-valued
Interaction Covariates
X16ia       female × lifestyle        binary
X17ia       female × money/finance    binary
X18ia       female × opinion          binary
X19ia       female × science/tech     binary
X20ia       female × sport            binary
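Using the covariate names in Table 1, the multilevel, random-intercepts Poisson model could be written roughly as follows, with y_ia denoting the social media impressions of article i by byline a. This is a sketch under stated assumptions: the log link, the coefficient symbols β, the byline-level intercept u_a, and its normal distribution are the conventional choices for this model class and are not spelled out in the text.

```latex
\begin{align}
y_{ia} &\sim \mathrm{Poisson}(\lambda_{ia}) \\
\log \lambda_{ia} &= \beta_0
  + \sum_{k=1}^{13} \beta_k X_{k,ia}   % article-level covariates
  + \beta_{14} X_{14,a} + \beta_{15} X_{15,a}   % byline-level covariates
  + \sum_{k=16}^{20} \beta_k X_{k,ia}   % female-by-section interactions
  + u_a \\
u_a &\sim \mathcal{N}(0, \sigma_u^2)   % random intercept for byline a
\end{align}
```

Under this form, news published on a Monday by a male byline is the reference case, and each interaction term shifts the female coefficient within one section.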
Since article titles offer key information that readers use in their decision to click on or share an article, we controlled for the order of magnitude of title length (i.e., its logarithm) as an article-level covariate. Since we expected a nonlinear relationship, where very short and very long titles are less likely to be shared, we included this covariate in both linear and squared forms.
Journalists often report an anecdotal relationship between the day of the week on which an article is published and its popularity. We included day of the week as an article-level categorical control covariate, with Monday as the reference day.
Journalists vary in their experience and publication frequency. To control for this, we included a byline-level covariate for the (log-transformed) total number of articles by that author in our data set.
Between-Byline Variation in Social Media Impressions
During the time period spanned by our data set, the people whose articles were published in the Guardian, Telegraph, and Daily Mail included politicians, first-time writers, television personalities, and sporting celebrities, as well as professional journalists with varying levels of experience and notability. Since some of these people are better known than others, and since our research question concerns gender differences between bylines, we fit a multilevel, random-intercepts Poisson regression model that accounts for variation between bylines.
FINDINGS
By fitting a multilevel, random-intercepts Poisson regression model for each newspaper, we found that social media impressions do differ by gender and that this difference varies with newspaper section. Our results are summarized in Table 2. In some newspaper sections, the magnitude of the difference in social media impressions by gender is very large, often favoring articles written by men.
The exponential of each coefficient in a Poisson regression model is typically interpreted as an incidence rate ratio, i.e., the expected multiplicative change in the dependent variable for a unit change in the corresponding covariate, holding the other covariates constant. For the Daily Mail, an article by a woman in the sport section has an incidence rate of social media impressions that is 33% of the incidence rate for an article by a man. For the Telegraph, a news article by a woman has an incidence rate of social media impressions that is 86% of the incidence rate for an article by a man. An article in the money/finance section of the Guardian
Table 2. Per-newspaper multilevel models for social media impressions.
Dependent Variable: Social Media Impressions
                         Daily Mail          Guardian            Telegraph
Article-Level Predictors
log(title length)        1.461*** (0.024)    0.615*** (0.005)    0.078*** (0.007)
log(title length)^2     −0.163*** (0.004)   −0.161*** (0.001)   −0.016*** (0.002)
Tuesday                  0.149*** (0.002)   −0.121*** (0.001)   −0.056*** (0.002)
Wednesday                0.143*** (0.002)   −0.068*** (0.001)   −0.200*** (0.002)
Thursday                 0.009*** (0.002)   −0.104*** (0.001)   −0.085*** (0.002)
Friday                   0.019*** (0.002)   −0.068*** (0.001)    0.038*** (0.002)
Saturday                 0.255*** (0.003)    0.170*** (0.002)    0.023*** (0.002)
Sunday                   0.480*** (0.002)    0.158*** (0.001)    0.162*** (0.002)
lifestyle                0.535*** (0.005)   −0.043*** (0.003)    0.001    (0.003)
money/finance           −2.622*** (0.013)   −0.186*** (0.003)   −0.416*** (0.003)
opinion                 −0.577*** (0.008)    0.261*** (0.002)    0.062*** (0.003)
science/tech             0.209*** (0.003)    0.541*** (0.002)    0.268*** (0.003)
sport                    0.836*** (0.013)   −0.470*** (0.004)   −0.181*** (0.006)
Byline-Level Covariates
log(total articles)      0.208*** (0.029)    0.100*** (0.014)    0.130*** (0.016)
female                   0.391*** (0.081)    0.089*** (0.031)   −0.155*** (0.052)
Interaction Covariates
female × lifestyle      −0.558*** (0.007)    0.362*** (0.005)    0.196*** (0.006)
female × money/finance  −0.048**  (0.024)   −0.386*** (0.005)   −0.119*** (0.007)
female × opinion         0.247*** (0.024)   −0.098*** (0.004)   −0.135*** (0.007)
female × science/tech    0.130*** (0.006)    0.059*** (0.003)   −0.045*** (0.006)
female × sport          −1.511*** (0.035)   −0.077*** (0.009)   −0.265*** (0.012)
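The two incidence rate ratios quoted in the findings can be recovered from the Table 2 coefficients. The sketch below assumes that the female-to-male ratio within a non-reference section is obtained by adding the female coefficient to the matching female × section interaction before exponentiating, and that the news section (the reference) uses the female coefficient alone. The helper name `incidence_rate_ratio` is hypothetical.

```python
import math


def incidence_rate_ratio(*log_coefficients: float) -> float:
    """Exponentiate the sum of log-scale Poisson coefficients."""
    return math.exp(sum(log_coefficients))


# Daily Mail, sport section: female main effect plus female x sport interaction.
daily_mail_sport = incidence_rate_ratio(0.391, -1.511)

# Telegraph, news section (reference section): female main effect only.
telegraph_news = incidence_rate_ratio(-0.155)

if __name__ == "__main__":
    print(f"Daily Mail sport, female vs. male: {daily_mail_sport:.2f}")
    print(f"Telegraph news, female vs. male:   {telegraph_news:.2f}")
```

Rounded to two decimals, these ratios match the 33% and 86% figures in the text, which suggests the reported rates combine the main effect and the interaction in this way.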
Publication date: 2015